Detailed protocol


Experimental design and study procedures 

This was a prospective cohort study involving a convenience sample of previously healthy children <2 years of age with acute RSV infection, and a 
cohort of healthy, asymptomatic age-matched controls. The study was conducted at Nationwide Children’s Hospital (NCH; Columbus, OH) from 2011 to 
2015. Children were enrolled at NCH urgent care clinics, the emergency department (ED), or the inpatient hospital units (ward or PICU) within a median of 24 hours
(interquartile range, 17-38) of hospitalization, and blood and nasopharyngeal samples were obtained. RSV was diagnosed per standard of care using a rapid antigen assay or PCR
test. Healthy controls were enrolled during well-child visits or minor elective surgical procedures not involving the respiratory tract (13). Exclusion criteria 
included documented bacterial co-infections, premature birth (<36 weeks of gestation), congenital or chronic medical conditions, and immunodeficiency.
For healthy controls, additional exclusion criteria included presence of fever or symptoms of respiratory tract infection within two weeks of enrollment. 

Blood samples were analyzed for: 1) transcriptome; 2) cell immunophenotype; and 3) white blood cell count (WBC) with differential. Samples were 
drawn in this order, but because of the small blood volumes obtained from some children, we were unable to obtain all samples from all participants. Nasopharyngeal
(NP) swabs were collected from RSV inpatients and outpatients at enrollment by trained study personnel using an approach that has been validated in
multicenter trials (33, 34). To confirm and quantify RSV loads, qRT-PCR analyses targeting the N gene were performed on NP samples from all children included
in the study (14). Other respiratory viruses were detected using PCR assays (either the Luminex xTag Respiratory Panel, Austin, TX; or the FilmArray
respiratory viral panel, BioFire, BioMerieux, Durham, NC) (23).

Demographic and clinical parameters were collected using a standardized questionnaire and by reviewing electronic healthcare records. Disease 
severity was assessed primarily by the need for hospitalization (inpatients [=severe disease] vs. outpatients [=mild disease]) and using a standardized 
clinical disease severity score (CDSS) (14, 23). In hospitalized patients, additional parameters of severity were collected including administration of 
supplemental oxygen, PICU admission, and total duration of hospitalization. In addition, families were contacted 2 and 4 weeks after enrollment to confirm 
the lack of subsequent readmissions in the inpatient group or hospitalization in the outpatient cohort. 

Study objectives 

We sought to define the differences in clinical characteristics, viral loads and immune profiles between children with mild RSV (outpatients) and severe RSV infection
(hospitalized) across different ages. We hypothesized that RSV viral loads and host immune responses would differ according to disease severity and age at
the time of the infection. Specifically, we hypothesized that children with mild infection would mount a more effective innate immune response that would be
associated with protection from developing severe disease.
Transcriptome analyses 
Sample collection, processing and RNA hybridization 

Blood was collected in Tempus tubes (Applied Biosystems, Foster City, CA) and stored within 2-4 hours at -20°C until further analyses. RNA was
extracted, processed and hybridized onto Illumina Human HT-12 v4 microarray chips (47,323 probes; Illumina, San Diego, CA) as described (13). After
hybridization, the chips were scanned on an Illumina Beadstation 500. Illumina GenomeStudio software was used for background subtraction and to scale average
signal intensities.
Data pre-processing 

We first selected transcripts that were ‘present’ (signal precision <0.01) in ≥10% of samples (PAL10%; 18,213 transcripts) as described (13, 35).
Next, raw expression values <10 were set to 10 and the data were log2-transformed. Using PCA and principal variance component analysis (PVCA) in R (36), we
identified a technical batch effect within the discovery cohort that was associated with globin reduction (Fig S4A). The batch effect was corrected using the
ComBat function of the SVA package in R (37, 38) (Fig S4B).
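
The filtering and transformation steps described above can be sketched in a few lines. This is only an illustration in Python under stated assumptions, not the code used in the study (which was run in R): the function name `preprocess`, the dictionary layout, and the toy probes are hypothetical, and the batch-correction step (ComBat) is not reproduced here.

```python
import math

def preprocess(expr, detection_p, p_cut=0.01, min_fraction=0.10, floor=10.0):
    """Keep 'present' transcripts, floor low raw values, and log2-transform.

    expr and detection_p map a transcript ID to a list of per-sample values
    (an assumed layout, chosen only for this illustration).
    """
    processed = {}
    for transcript, values in expr.items():
        p_values = detection_p[transcript]
        # Fraction of samples in which the transcript is called 'present'.
        present = sum(1 for p in p_values if p < p_cut) / len(p_values)
        if present < min_fraction:
            continue  # fails the PAL10% filter
        # Raw values below the floor are set to the floor before log2.
        processed[transcript] = [math.log2(max(v, floor)) for v in values]
    return processed

# Toy data: 10 samples, two hypothetical probes.
expr = {
    "probe_A": [4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 512.0, 1024.0, 2048.0],
    "probe_B": [5.0] * 10,
}
detection_p = {
    "probe_A": [0.5] * 9 + [0.001],  # present in 1 of 10 samples, so kept
    "probe_B": [0.5] * 10,           # never present, so dropped
}
result = preprocess(expr, detection_p)
```

Flooring before the logarithm keeps very low intensities from producing large negative log values that would otherwise dominate fold-change estimates.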


Differential gene expression analysis (transcriptional signatures) 

Two datasets were used for transcriptome analyses. We first derived the transcriptional signature for mild (outpatients) and severe (inpatients) 
RSV infection by comparing each RSV cohort to the same healthy controls in the discovery cohort. Patients for this cohort were matched for age and gender. 
We used the limma package in R (39) and applied stringent statistical filtering (FDR p<0.01 with Benjamini-Hochberg multiple test correction, and a 1.25-fold
change threshold). The transcriptional signatures identified were validated with PCA in a non-age-matched validation cohort that included RSV outpatients, inpatients
and healthy controls (Figure 2B&D). The two cohorts were independent except for four healthy controls whose samples were hybridized twice and used in
the discovery and validation cohorts (Fig 1). The samples included in the study were processed in two different batches (Fig S5), which largely overlapped
in terms of enrollment (batch#1: 2012-2014, and batch#2: 2012-2015). To avoid any potential batch effect, samples included in the discovery cohorts were 
hybridized in batch #1, while those used for validation were hybridized in batch #2. 
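
To make the filtering criteria concrete, the following Python sketch applies a Benjamini-Hochberg adjustment and a fold-change cutoff to a set of per-transcript p-values. It assumes the p-values and log2 fold changes have already been computed (in the study, by limma's moderated statistics, which are not reimplemented here); the function names and the toy numbers are hypothetical.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_transcripts(log2_fc, p_values, fdr=0.01, fold=1.25):
    """Indices passing both the FDR cutoff and the fold-change cutoff."""
    adjusted = benjamini_hochberg(p_values)
    min_log2_fc = math.log2(fold)  # 1.25-fold in either direction
    return [
        i for i in range(len(p_values))
        if adjusted[i] < fdr and abs(log2_fc[i]) >= min_log2_fc
    ]

# Toy input: four hypothetical transcripts.
p_values = [0.0001, 0.5, 0.002, 0.03]
log2_fc = [1.0, 2.0, 0.1, -1.0]
adjusted = benjamini_hochberg(p_values)
selected = select_transcripts(log2_fc, p_values)
```

In this toy example only the first transcript is selected: the third passes the FDR cutoff but not the fold-change cutoff, and the second and fourth fail the FDR cutoff.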

Functional characterization of differentially expressed transcripts 

To define the biological function of the “mild” and “severe” RSV signatures, we applied modular transcriptional analyses. Briefly, this is a systems-scale
approach that aims to reduce the large volume of transcriptional data into functional pathways or modules. Transcriptional modules are formed by genes
that are coordinately expressed, thus allowing functional interpretation of the microarray data as biologically useful information. Modular over- and under-expression
was defined by the percentage of transcripts within each module that were differentially expressed in RSV outpatients and inpatients vs. healthy
controls. A detailed description of this mining analysis strategy has been reported elsewhere (19). 
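
As a minimal illustration of how a module-level value can be derived from transcript-level calls, the Python sketch below computes, for each module, the percentage of its transcripts that are over- or under-expressed. The module names, transcript IDs and the function `module_scores` are hypothetical; the module definitions used in the study are those of the cited framework (19).

```python
def module_scores(modules, over, under):
    """Percent of each module's transcripts that are over- or under-expressed.

    modules maps a module name to its list of transcript IDs; over and under
    are sets of transcripts called differentially expressed vs. controls.
    """
    scores = {}
    for name, transcripts in modules.items():
        n = len(transcripts)
        pct_over = 100.0 * sum(t in over for t in transcripts) / n
        pct_under = 100.0 * sum(t in under for t in transcripts) / n
        scores[name] = (pct_over, pct_under)
    return scores

# Toy input: two hypothetical modules.
modules = {
    "M_interferon": ["g1", "g2", "g3", "g4"],
    "M_tcell": ["g5", "g6"],
}
over = {"g1", "g2", "g3"}
under = {"g5"}
scores = module_scores(modules, over, under)
```

Here the hypothetical interferon module is 75% over-expressed and the T cell module is 50% under-expressed.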

Modular maps for the outpatient and inpatient discovery cohorts were derived by comparison with the same age-matched healthy controls (FDR
p<0.05, Benjamini-Hochberg multiple test correction). Results were confirmed using a) correlations of modular expression between the discovery
and validation cohorts (Spearman’s correlation coefficient; Fig. 3B); b) the DAVID bioinformatics tool (version 6.7, available at
https://david-d.ncifcrf.gov) (40); and c) the Ingenuity Pathway Analysis tool (IPA; QIAGEN, Redwood City, CA, USA; Table S7).

Age analyses using modular expression were performed in children <6 months vs. 6-24 months of age according to disease severity (outpatients vs.
inpatients) using the chi-square test, adjusted for multiple comparisons with the Benjamini-Hochberg correction (Table S5).
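
For readers unfamiliar with the test, a chi-square comparison of two age groups on a single module reduces to a 2x2 table. The Python sketch below is illustrative only: the table layout (rows as age groups, columns as differentially expressed vs. not) and the counts are assumptions, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts of module transcripts that are differentially
# expressed (first column) or not (second column) in each age group.
statistic, p_value = chi_square_2x2(30, 10, 15, 25)
```

The resulting p-values, one per module, would then be adjusted across modules with the Benjamini-Hochberg procedure.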


Blood immune cell populations 

Blood samples (1-2 ml) were obtained in ACD tubes (BD vacutainer ACD Solution B; BD, Franklin Lakes, NJ) and processed within 2-4 hours of
collection. Five blood aliquots of ~200 µL each were stained with different antibody panels for characterization of innate (neutrophils, monocytes
[including HLA-DR low monocytes as a functional marker of activation], NK cells, DCs) and adaptive immune cell populations (T cells, peripheral T
follicular-like helper cells [Tfh], and B cells; Tables S11&12). Samples were incubated with antibody panels for 15 minutes, and red blood cells were lysed with
BD FACS lysing solution (BD Biosciences, San Jose, CA). Stained cells were washed twice with phosphate-buffered saline (PBS) before storage at 4°C until
cell acquisition, which was performed within 1-3 days on an LSRII flow cytometry instrument (BD Biosciences, San Jose, CA). Data were analyzed using FlowJo
software v9.8.2 (Tree Star, Ashland, OR). Data are presented as absolute numbers (Table 2; Fig 5) in children with total WBC data available (89%; 93/104
children), and as percentages (Table S10). Individual patient data are provided in Table S1. We used immune cell counts for all analyses except to compute
correlations between immune cell populations and modular gene expression, for which we used percentages because of the lack of WBC count data in 5 of
37 patients with paired data.
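
The dependence of absolute counts on WBC data can be illustrated with a short Python sketch. It assumes, as is common practice but not stated explicitly in this protocol, that an absolute count is the total WBC count multiplied by the population's percentage of leukocytes; the function name and the numbers are hypothetical.

```python
def absolute_count(percent_of_wbc, wbc_per_ul):
    """Absolute cells per microliter, or None when no WBC count is available.

    Assumes percent_of_wbc is expressed relative to total leukocytes.
    """
    if wbc_per_ul is None:
        return None  # only the percentage can be reported for this child
    return wbc_per_ul * percent_of_wbc / 100.0

# Hypothetical child with 8% monocytes and 10,000 WBC per microliter.
with_wbc = absolute_count(8.0, 10000.0)
# Hypothetical child with the same percentage but no WBC count.
without_wbc = absolute_count(8.0, None)
```

Under this assumption, any analysis that must include children without a WBC count has to fall back on percentages.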


Sample size calculations and statistical analyses

For sample size calculation, best practices in the transcriptome field dictate utilization of at least two independent sets of samples for the purpose of 
validating candidate signatures (or profiles). In previous studies in individuals with acute infections, others and we have obtained robust profiles using
15-20 subjects per group (13, 35, 38, 41, 42). We analyzed demographic and clinical data using SPSS v22.0 (IBM Corp, Armonk, NY) and GraphPad
Prism v7.0b (La Jolla, CA) software packages; data are presented as medians with interquartile ranges (IQRs; 25%-75%). We compared continuous
variables using either Mann-Whitney or Kruskal-Wallis test with Dunn’s or Benjamini-Hochberg post hoc tests for multiple testing, and categorical variables
using either chi-square or Fisher’s exact test.
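
As a small illustration of the summary format, the Python sketch below returns a median with its 25th and 75th percentiles. The values are invented, and the quartile definition (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; SPSS and Prism may use slightly different interpolation rules.

```python
import statistics

def median_iqr(values):
    """Return (median, 25th percentile, 75th percentile)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3

# Hypothetical values, for example hours from admission to enrollment.
hours = [12, 17, 20, 24, 24, 30, 38, 40, 52]
median, q1, q3 = median_iqr(hours)
```

A result of this kind would be reported in the text as median (IQR), for example "24 (20-38)".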

For transcriptome and cellular immunophenotype data we used R (R Foundation for Statistical Computing, Vienna, Austria). The statistical tools 
used for microarray analyses are included in each of the above sections. For correlations between transcriptomic and cellular, clinical or virology data we
used Spearman’s correlation coefficient.
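
Spearman's coefficient is the Pearson correlation of the ranked values, with tied observations given their average rank. The Python sketch below shows that calculation; the function names and the toy data are hypothetical, and the study's own correlations were computed in R.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Toy data: a monotone but non-linear relationship.
module_expression = [1, 2, 3, 4, 5]
cell_percentage = [2, 4, 8, 16, 32]
rho = spearman(module_expression, cell_percentage)
```

Because only ranks enter the calculation, any monotone relationship gives a coefficient of 1 or -1 regardless of its shape.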

Last, we analyzed whether the main immune variables found to be relevant in the study (IFN expression & HLA-DR low monocyte numbers) were 
associated with hospitalization. To this end we conducted multivariable analyses using logistic regression with Firth’s correction for small sample size, after 
accounting for other factors. Due to the limited sample size, additional covariates were added in separate models, so that a maximum of two risk factors 
were included within any given model. Analyses were conducted using SAS 9.4 (SAS Institute, Cary, NC) with two-sided p-values <0.05 considered
statistically significant.
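
Firth's correction penalizes the logistic likelihood so that estimates remain finite even when a predictor separates the outcome perfectly, which is a common problem in small samples. The Python sketch below illustrates the idea with a plain Newton-type iteration on the Firth-modified score; it is not the SAS procedure used in the study, the function names are hypothetical, and the toy data are invented. It omits the step-halving safeguards that production implementations typically include.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def firth_logistic(x_rows, y, max_iter=100, tol=1e-10):
    """Firth-penalized logistic regression; x_rows must include an intercept column."""
    k = len(x_rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row))))
             for row in x_rows]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X^T W X.
        info = [[sum(wi * row[r] * row[c] for wi, row in zip(w, x_rows))
                 for c in range(k)] for r in range(k)]
        # Hat-matrix diagonal: h_i = w_i * x_i^T info^{-1} x_i.
        h = [wi * sum(a * b for a, b in zip(row, solve(info, row)))
             for wi, row in zip(w, x_rows)]
        # Firth-modified score: sum_i (y_i - p_i + h_i (0.5 - p_i)) x_i.
        score = [sum((yi - pi + hi * (0.5 - pi)) * row[r]
                     for yi, pi, hi, row in zip(y, p, h, x_rows))
                 for r in range(k)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data with perfect separation: ordinary maximum likelihood would diverge.
x_rows = [[1.0, 0.0]] * 3 + [[1.0, 1.0]] * 3  # intercept, binary risk factor
y = [0, 0, 0, 1, 1, 1]                        # 1 = hospitalized (hypothetical)
beta = firth_logistic(x_rows, y)
```

For a single binary predictor such as this one, the penalized estimate is equivalent to adding 0.5 to each cell of the 2x2 table, which gives a convenient check on the result.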